ABSTRACT


The  objective  of  the  project  was  to  develop  broader  formulations 
of  the  mathematical  (statistical)  theory  of  decisions.  This  final  report 
presents  two  broad  scope  generalizations  which  have  resulted  from  the 
project. 

The  first  generalization  discussed  is  a  decision-making  model 
which  applies  to  the  case  of  a  not-well-informed  decision  maker  with 
independent  data  sources.  In  this  model,  the  inference  about  the  prior 
distribution  is  determined  from  the  solution  of  an  adjunct  decision 
problem,  which  specifies  the  minimum  risk  hypothesis  in  the  light  of 
the  available  information. 

The second generalization presented is a model of multi-period decision making for both stationary and Markovian environments. In contrast to the model discussed in the above paragraph, this model does not assume independent data sources, i.e., it does not assume that the observation processes are unaffected by the actions of the decision maker.




TABLE OF CONTENTS

I. INTRODUCTION

II. DECISION TASKS WITH INDEPENDENT STATIONARY DATA PROCESSES AND STATIONARY ENVIRONMENT

2.1 Case I: Known Environment and Known Data Processes

2.2 Case II: Known Data Sources, Known $g_t(\pi)$

2.2.1 Averaging Model

2.2.2 Double Decision Task Model

2.2.3 Computer Programs of the Models

2.2.4 A Numerical Example

III. A MODEL OF MULTI-PERIOD DECISION MAKING

3.1 Markovian Environments

IV. PROBLEM AREAS UNCOVERED BY THE RESEARCH AND POTENTIAL EXTENSIONS

REFERENCES

APPENDIX

ILLUSTRATIONS

Figure 1. Flow Chart of Computer Program

Figure 2. Representation of a Numerical Example


SECTION  I 


INTRODUCTION 


The  objective  of  the  project  was  to  develop  broader  formulations  of 
the  mathematical  (statistical)  theory  of  decisions.  This  final  report 
presents  two  broad  scope  generalizations  which  have  resulted  from  the 
project. 

The  first  generalization  discussed  is  a  decision-making  model 
which  applies  to  the  case  of  a  not-well-informed  decision  maker  with 
independent  data  sources.  In  this  model,  the  inference  about  the  prior 
distribution  is  determined  from  the  solution  of  an  adjunct  decision 
problem,  which  specifies  the  minimum  risk  hypothesis  in  the  light  of 
the  available  information.  This  model,  together  with  the  corresponding 
averaging  model  of  Reference  1,  has  been  programmed  and  debugged  so 
that  in  future  work  it  will  be  possible  to  obtain  information  about  the 
behavior  of  the  model. 

The second generalization presented is a model of multi-period decision making for both stationary and Markovian environments. In contrast to the model discussed in the above paragraph, this model does not assume independent data sources, i.e., it does not assume that the observation processes are unaffected by the actions of the decision maker. Stated in another way, in this generalization we are dealing with strongly sequential decision tasks in which the decision maker's actions can redesign the information system.

In  addition  to  the  decision-making  models  discussed  in  this  report, 
the  present  contract  has  resulted  in  an  additional  report  which  has  been 
separately  submitted  for  publication  as  a  Technical  Documentary 
Report.  This  report  is  entitled  "The  Karhunen-Loeve  Expansion  and 
Factor  Analysis"  and  was  written  by  Dr.  Satosi  Watanabe  of  Yale 
University. 

The  remainder  of  this  report  is  divided  into  three  sections. 

Section  II  discusses  the  decision-making  model  for  the  not-well- 
informed  decision  maker  in  a  stationary  environment  with  stationary 
and  independent  data  sources.  Section  III  discusses  the  multi-period 
decision-making  model  for  both  stationary  and  Markovian  environments. 
In  Section  IV,  problem  areas  uncovered  by  the  research  and  potential 
areas  for  future  research  are  discussed. 


SECTION  II 


DECISION  TASKS  WITH  INDEPENDENT  STATIONARY 
DATA  PROCESSES  AND  STATIONARY  ENVIRONMENT 


2.1 Case I: Known Environment and Known Data Processes

This, of course, is a well-known case and is presented here as background. Let us assume that a decision maker has to select an act out of a collection of acts $\{A^k\}$ so as to pursue some rational objective, namely the maximization of the expected utility of his decisions.

Let us assume that the utility of an act is a function of the state of nature $X^i$, i.e., that for each ordered pair $(A^k, X^i)$ there is a utility scalar $u(A^k, X^i)$. Let us assume further that the decision maker is equipped with an observation (or information) system which outputs messages belonging to the collection $\{Y^j\}$. The information system will be characterized by the collection of conditional probabilities

$$\{P(Y^j \mid X^i)\}.$$

We will also assume that the environment can be characterized by a probability distribution over the states, $\{P(X^i)\}$. We are assuming, furthermore, that $\{P(Y^j \mid X^i)\}$ is not affected by the action of the decision maker (i.e., the decision maker is not redesigning his information system during operation) and that both $\{P(X^i)\}$ and $\{P(Y^j \mid X^i)\}$ are stationary.

The problem we wish to solve is to obtain the decision rule

$$A^{*j} = D(Y^j)$$

where $A^{*j}$ is the act which maximizes the expected utility conditional on receiving the message $Y^j$ from the information system. But we have that

$$v_{Y^j}(A^k) = \sum_i u(A^k, X^i)\, P(X^i \mid Y^j).$$

Then $A^{*j}$ is given by:

$$v(A^{*j}) = \sum_i u(A^{*j}, X^i)\, P(X^i \mid Y^j) = \max_k \sum_i u(A^k, X^i)\, P(X^i \mid Y^j).$$

By Bayes rule we have that

$$P(X^i \mid Y^j) = \frac{P(Y^j \mid X^i)\, P(X^i)}{P(Y^j)}$$

so that

$$v(A^{*j}) = \max_k \sum_i u(A^k, X^i)\, P(X^i)\, P(Y^j \mid X^i)\, \frac{1}{P(Y^j)}.$$


Let us define the matrices $U$, $D$, $Q$ as follows:

$$\{U\}_{ki} = u(A^k, X^i)$$

$$\{D\}_{ij} = \begin{cases} 0 & i \neq j \\ P(X^i) & i = j \end{cases}$$

$$\{Q\}_{ij} = P(Y^j \mid X^i)$$

Then

$$v(A^{*j}) = \max_k \sum_i \{UD\}_{ki}\, \{Q\}_{ij}\, \frac{1}{P(Y^j)}.$$

That is,

$$v(A^{*j}) = \frac{1}{P(Y^j)} \max_{\text{row}}\, UD\,[Q]_j$$

where $[Q]_j$ is the $j$-th column of $Q$, and

$$v(A^{*j}) = \frac{1}{P(Y^j)} \max_{\text{row}}\, [UDQ]_j = \frac{1}{P(Y^j)}\, [UDQ]^{*}_{j}$$

where the $*$ indicates that the largest component of the included vector is to be taken.
Since the decision rule yields the value of the index $*j$ for each possible observation $Y^j$, it is readily seen that a convenient representation of the decision rule is

$$\delta[Y] = [UDQ]^{+}$$

where $Y$ is the row vector listing all possible observations and the operator $+$ substitutes each column of the matrix inside with the row number of its largest component.

The performance of the decision maker will then be modeled (in the long run) by

$$v(U, D, Q) = \sum_j P(Y^j)\, v(A^{*j}) = \sum_j [UDQ]^{*}_{j} = [UDQ]^{*}\, \xi$$

where $\xi$ is a column vector of all ones conformable to the row vector $[UDQ]^{*}$.

The function $v(U, D, Q)$ allows us to set up a measure of effectiveness for information systems. In fact, if $Q_0$ is the system characterized by

$$P(Y^j \mid X^i) = P(Y^j),$$

i.e., the null system, the effectiveness of $Q$ is given by

$$H(Q) = v(U, D, Q) - v(U, D, Q_0).$$
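
The matrix formulation of Case I can be sketched numerically. The following is a minimal illustration, not the report's FORTRAN program: the function names and the two-state example values are hypothetical, and act indices are 0-based rather than the 1-based row numbers used in the text.

```python
import numpy as np

def decision_rule(U, prior, Q):
    """Return (rule, value) for U[k, i] = u(A^k, X^i), prior[i] = P(X^i),
    and Q[i, j] = P(Y^j | X^i)."""
    M = U @ np.diag(prior) @ Q       # M[k, j] = sum_i u(A^k, X^i) P(X^i) P(Y^j | X^i)
    rule = M.argmax(axis=0)          # the "+" operator: best row (act) for each column (message)
    value = M.max(axis=0).sum()      # [UDQ]^* times the all-ones vector
    return rule, value

def effectiveness(U, prior, Q):
    """H(Q) = v(U, D, Q) - v(U, D, Q0), with Q0 the null system."""
    marginal = prior @ Q                       # P(Y^j)
    Q0 = np.tile(marginal, (Q.shape[0], 1))    # P(Y^j | X^i) = P(Y^j) for every state
    return decision_rule(U, prior, Q)[1] - decision_rule(U, prior, Q0)[1]

# Hypothetical two-state, two-act, two-message example.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
prior = np.array([0.5, 0.5])
Q = np.array([[0.9, 0.1], [0.1, 0.9]])
rule, value = decision_rule(U, prior, Q)
H = effectiveness(U, prior, Q)
```

Because the factor $1/P(Y^j)$ is common to every row of a column, it does not affect which act is selected, so the sketch works directly with $UDQ$.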

In  utilizing  statistical  decision  theory  to  evaluate  decision  behavior, 
attention  must  be  paid  to  the  fact  that  the  assumptions  underlying  the 
formal  model  in  use  should  be  empirically  valid.  This  in  turn  generates 
an  incentive  to  develop  the  formal  theory  for  the  assumptions  of  as  many 
systems  as  possible. 

Of the assumptions of the traditional model presented above, the two which are most likely not to be verified in the experimental situation are:

that the decision maker knows the prior distribution

that the decision maker knows the statistical properties of his observation processes.

We are then interested in developing the theory so as to remove one or both of these assumptions, replacing them with weaker ones.

A  decision  task  which  results  from  weakening  one  or  more  of  the 
assumptions  of  another  decision  task  will  be  said  to  be  a  degradation  of 
the original task (Ref. 1).

This  note  is  concerned  with  the  degradation  of  the  decision  task 
presented  above  which  occurs  when  the  decision  maker  is  assumed  to 
know  a  density  function  over  the  space  of  priors  rather  than  the  prior 
itself.  In  other  words,  the  decision  maker  is  assumed  to  have  some 
uncertainty  on  which  prior  actually  prevails.  We  first  consider  this 
degraded  task  from  the  viewpoint  of  Reference  1,  generalizing  its 
approach  to  include  any  independent  stationary  data  generating  process 
and  any  density  over  the  space  of  priors.  Next,  we  introduce  a  new 
formal  model  for  the  same  degraded  task;  in  this  model  the  uncertainty 
about  the  prior  distribution  of  the  basic  decision  task  is  not  eliminated 
by  averaging  over  the  space  of  priors,  but  is  eliminated  by  selecting 
that  prior  distribution  which  minimizes  the  subjective  risk  of  mis-inference. 
This model should predict behavior which is more conservative than that of the averaging model when the decision maker's performance is strongly sensitive to the prior distribution assumed.

2.2 Case II: Known Data Sources, Known $g_t(\pi)$

2.2.1 Averaging Model

Let us indicate with $\pi$ the vector $\{P(X^1), P(X^2), \ldots, P(X^N)\}$ and with $g_t(\pi)$ the density function which prevails at time $t$ over the space of $\pi$-distributions ($N$-dimensional simplex). Instead of assuming that the decision maker knows which $\pi$ applies, we shall assume that he knows $g_t(\pi)$ over the space of $\pi$-distributions. $g_t(\pi)$ of course will be transformed from instant to instant to reflect the learning the decision maker undergoes by observing the environment. The problem, then, is to find the decision rule under these circumstances.

In this case, we can compute the expected utility of an act $A^k$ given that $Y^j$ has been observed and that $\pi$ is assumed to be the prior distribution as follows:

$$v(A^k)_{Y^j, \pi} = E[u(A^k, X) \mid Y^j, \pi] = \sum_i u(A^k, X^i)\, P(X^i \mid Y^j, \pi).$$

By Bayes rule we have

$$P(X^i \mid Y^j, \pi) = \frac{P(Y^j \mid X^i, \pi)\, P_\pi(X^i)}{P_\pi(Y^j)}$$

where

$P_\pi(X^i) = \pi_i$;

$P_\pi(Y^j)$ is the marginal probability of $Y^j$ when $\pi$ is assumed to be the prior distribution;

$P(Y^j \mid X^i, \pi) = P(Y^j \mid X^i)$, since the data sources are independent from the characteristics of the environment.

Then

$$v(A^k)_{Y^j, \pi} = \sum_i u(A^k, X^i)\, P_\pi(X^i)\, P(Y^j \mid X^i)\, \frac{1}{P_\pi(Y^j)}.$$


If in addition to the matrices $U$ and $Q$ defined above, we define the matrices $D_\pi$ and $\Lambda_\pi$ as follows:

$$\{D_\pi\}_{ij} = \begin{cases} 0 & i \neq j \\ P_\pi(X^i) = \pi_i & i = j \end{cases}$$

$$\{\Lambda_\pi\}_{ij} = \begin{cases} 0 & i \neq j \\ P_\pi(Y^j) & i = j \end{cases}$$

we have

$$v(A^k)_{Y^j, \pi} = \sum_i \{U D_\pi\}_{ki}\, \{Q \Lambda_\pi^{-1}\}_{ij} = [U D_\pi Q \Lambda_\pi^{-1}]_{kj},$$

i.e., the expected value of the act $A^k$, given that $Y^j$ is observed and the prior $\pi$ is held, is the $kj$-th element of the matrix

$$U D_\pi Q \Lambda_\pi^{-1}.$$
The decision rule is a transformation of the message into an optimal act; thus it requires the specification of one input, namely $Y^j$, and not $Y^j$ and $\pi$. It follows that in order to compute the decision rule, we have to eliminate $\pi$. In the averaging model, one computes the quantities

$$v(A^k)_{Y^j} = E_\pi\left[ v(A^k)_{Y^j, \pi} \right],$$

i.e., by taking the expectation over $\pi$, and then uses these quantities as the basis for the computation of the decision rule.

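The optimal act upon receipt of $Y^j$ is given by

$$v(A^{*j}) = \max_k v(A^k)_{Y^j} = \max_k \int g_t(\pi)\, v(A^k)_{Y^j, \pi}\, d\pi = \max_{\text{row}} \int g_t(\pi)\, [U D_\pi Q \Lambda_\pi^{-1}]_j\, d\pi.$$

If we write

$$\int g_t(\pi)\, [U D_\pi Q \Lambda_\pi^{-1}]\, d\pi = E_t[U D_\pi Q \Lambda_\pi^{-1}],$$

the decision rule is simply

$$\delta(Y) = \left\{ E_t[U D_\pi Q \Lambda_\pi^{-1}] \right\}^{+}.$$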


The learning of the decision maker would be reflected by his updating of $g_t(\pi)$ using Bayes rule, i.e.,

$$g_{t+1}(\pi) = \frac{P_r(Y^j \mid \pi)\, g_t(\pi)}{\int P_r(Y^j \mid \pi')\, g_t(\pi')\, d\pi'}$$

where $g_{t+1}(\pi)$ is the posterior density on the $\pi$-distributions after having observed $Y^j$. $P_r(Y^j \mid \pi)$, i.e., the probability of $Y^j$ given that the prior $\pi$ holds, is nothing else but the marginal probability of $Y^j$ computed using that $\pi$ as the prior. That is,

$$P_r(Y^j \mid \pi) = \sum_i P(Y^j \mid X^i)\, \pi_i = \sum_i Q_{ij}\, \pi_i = \{\pi Q\}_j,$$
i.e., the $j$-th component of the row vector $\pi Q$, $\pi$ being a row vector. Thus,

$$g_{t+1}(\pi) = \frac{\{\pi Q\}_j}{\int \{\pi' Q\}_j\, g_t(\pi')\, d\pi'}\, g_t(\pi) = \frac{\{\pi Q\}_j}{\{\bar{\pi} Q\}_j}\, g_t(\pi)$$

where $\bar{\pi}$ is the mean vector, with respect to $g_t(\pi)$, of the population of $\pi$ vectors.
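
The averaging model and its learning step can be sketched for a finite set of candidate priors. This is an illustration under a stated assumption: the integral over $g_t(\pi)$ is replaced by a weighted sum over a discrete population of priors, and every marginal $P_\pi(Y^j)$ is taken to be positive. The function names and example values are hypothetical.

```python
import numpy as np

def averaged_rule(U, Q, priors, g):
    """Averaging-model rule: 0-based act index per message, taken from the
    weighted average of U D_pi Q Lambda_pi^{-1} over the candidate priors."""
    avg = np.zeros((U.shape[0], Q.shape[1]))
    for pi, weight in zip(priors, g):
        marginal = pi @ Q                                  # P_pi(Y^j) = {pi Q}_j
        avg += weight * (U @ np.diag(pi) @ Q) / marginal   # column j divided by P_pi(Y^j)
    return avg.argmax(axis=0)

def update_weights(g, priors, Q, j):
    """Bayes update of the weights over candidate priors after observing message j."""
    likelihood = priors @ Q[:, j]      # {pi^m Q}_j for every candidate m
    posterior = g * likelihood
    return posterior / posterior.sum()

# Hypothetical example with two candidate priors, equally weighted at the start.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
Q = np.array([[0.9, 0.1], [0.1, 0.9]])
priors = np.array([[0.8, 0.2], [0.2, 0.8]])
g = np.array([0.5, 0.5])
rule = averaged_rule(U, Q, priors, g)
g_next = update_weights(g, priors, Q, j=0)
```

In this discrete form the normalizing sum in `update_weights` equals $\{\bar{\pi} Q\}_j$, the marginal computed with the mean prior.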

2.2.2 Double Decision Task Model

Let us consider the decision task in this case to be composed of two parts: a) make a decision about which $\pi$ prevails in the environment; b) make a decision about which act is optimal in the light of the $\pi$ so chosen, the utility structure, and the data source characteristics. The first decision, the one about $\pi$, also has to be optimal in a Bayesian sense.

Let us, then, formalize the first decision. For simplicity, we will assume that there is a discrete population of $\pi$'s and that $\pi^m$ represents a generic member of such a population. The collection $\{\pi^m\}$ is then the collection of states of the world for the first decision task. The messages about the world are still the $Y^j$, since the decision maker has to use the same data sources for both decision tasks. The acts are the $\{\alpha^n\}$, where $\alpha^n$ is the act of selecting $\pi^n$ as the value of $\pi$ to be used in the second decision task (the basic task).

In order to obtain the decision rule for the first task, we have to have:

A prior distribution over $\{\pi^m\}$

A utility structure for the ordered pairs $(\alpha^n, \pi^m)$

A model of the information system which supports the first task.

Let us begin with the model for the information system. The relevant information system is represented by the set of conditional probabilities $\{P(Y^j \mid \pi^m)\}$, since the $\pi^m$'s are the states and the $Y^j$'s are the available messages. We have seen above that $P(Y^j \mid \pi^m)$ is the marginal probability of $Y^j$ computed assuming that $\pi = \pi^m$, and that

$$P(Y^j \mid \pi^m) = [\pi^m Q]_j.$$
If $\Pi$ is the matrix such that its $m$-th row is the vector $\pi^m$, then we have that

$$\{\Omega\}_{mj} = P(Y^j \mid \pi^m) = \sum_e \{\Pi\}_{me}\, \{Q\}_{ej} = \sum_e \pi^m_e\, P(Y^j \mid X^e);$$

thus,

$$\Omega = \Pi Q.$$

Thus the model for the information system of the first task is obtained by premultiplying the model for the basic observation processes by a matrix whose rows are the various possible priors of the basic decision task.


The suitable prior distribution for the first task can be obtained from the density function $g_t(\pi)$. For example, if the state $\pi^m$ is said to obtain in the region $R_m$, then the prior probability of the state $\pi^m$ is

$$\int_{R_m} g_t(\pi)\, d\pi = G(\pi^m).$$


This  distribution  can  reflect  the  learning  of  the  decision  maker  from  one 
instant  to  the  next  through  the  simple  Bayesian  learning  process  discussed 
above  in  connection  with  the  averaging  model. 

Finally, we can determine the proper utility structure for the first decision task. Let us indicate with $L(\alpha^n, \pi^m)$ the utility of assuming that $\pi^n$ is the case when actually $\pi^m$ is true. This utility can be computed by considering the difference which assuming $\pi^n$ instead of the true $\pi^m$ will make in the expected return of the decision maker in connection with the basic decision task. Clearly

$$L(\alpha^n, \pi^m) = \left[ [U D_{\pi^m} Q]^{*\pi^n} - [U D_{\pi^m} Q]^{*} \right] \xi$$

where the operator $[\;]^{*\pi^n}$ selects the column components which would have been the largest in the expression $U D_{\pi^n} Q$. Since in general these components will not be the largest of their column for the expression $U D_{\pi^m} Q$, the quantity $L(\alpha^n, \pi^m)$ is zero or negative.
Then the decision rule is to select $\pi^n$ according to

$$[L D_G \Omega]^{+}$$

where $D_G$ is the diagonal matrix whose $m$-th diagonal element is $G(\pi^m)$. (The rows of the $L D_G \Omega$ matrix are associated with the various $\pi^n$'s and the columns with the various messages.)

The $\pi^n$ so obtained would then be used to select the act for the basic decision task according to the decision rule

$$[U D_{\pi^n} Q]^{+}.$$
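
The two stages of the double decision task can be sketched as follows. This is an illustration, not the report's program: it assumes a finite set of candidate priors, takes the first-task rule to be the column-wise argmax of $L D_G \Omega$ (with $D_G$ the diagonal matrix of the probabilities $G(\pi^m)$) by analogy with $[UDQ]^{+}$, and uses hypothetical names and example values with 0-based indices.

```python
import numpy as np

def mis_inference_utility(U, Q, priors):
    """L[n, m]: change in expected return from acting on prior n when prior m
    is true (zero or negative by construction)."""
    count = len(priors)
    L = np.zeros((count, count))
    cols = np.arange(Q.shape[1])
    for m, true_pi in enumerate(priors):
        M_true = U @ np.diag(true_pi) @ Q
        best = M_true.max(axis=0).sum()                          # [U D_{pi^m} Q]^* summed
        for n, assumed_pi in enumerate(priors):
            acts = (U @ np.diag(assumed_pi) @ Q).argmax(axis=0)  # rule under the assumed prior
            L[n, m] = M_true[acts, cols].sum() - best
    return L

def double_decision_rule(U, Q, priors, G):
    """Return (chosen prior index per message, act index per message)."""
    Omega = priors @ Q                                   # Omega[m, j] = P(Y^j | pi^m)
    L = mis_inference_utility(U, Q, priors)
    chosen = (L @ np.diag(G) @ Omega).argmax(axis=0)     # first task
    acts = np.array([(U @ np.diag(priors[n]) @ Q)[:, j].argmax()
                     for j, n in enumerate(chosen)])     # basic task with the chosen prior
    return chosen, acts

# Hypothetical example: two sharply different priors and a noisy sensor.
U = np.array([[1.0, 0.0], [0.0, 1.0]])
Q = np.array([[0.6, 0.4], [0.4, 0.6]])
priors = np.array([[0.9, 0.1], [0.1, 0.9]])
G = np.array([0.5, 0.5])
L = mis_inference_utility(U, Q, priors)
chosen, acts = double_decision_rule(U, Q, priors, G)
```

In this example the off-diagonal entries of `L` are strictly negative, so the message decides which prior is adopted before the basic act is selected.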

2.2.3  Computer  Programs  of  the  Models 

A general purpose FORTRAN program which incorporates the formal models discussed in Sections 2.2.1 and 2.2.2 has been written. To clarify the computations required by the two models, the control structure of the computer program is described in the flow chart presented as Figure 1. As can be seen from the flow chart, one can select between the two formal models by operating console switch No. 1. A comparison of the behavior of the two models in the face of the same observation series can be carried out quite easily. The program and all its subroutines have already been written and debugged, and listings of the computer programs are presented in the Appendix. Data will be collected on the behavior of these models for a very simple decision task. The task which will be used in exercising the models is described in the next section.

2.2.4  A  Numerical  Example 


Assume that an object (for example, an enemy ship) is in an area divided into four regions, as illustrated in Figure 2.

[Figure 2. Representation of a Numerical Example]

Let us assume that the sensors available are capable of identifying which intervals on the z and w axes "contain" the ship. The w-axis sensor is assumed to give a correct response with probability $q_1$, and the z-axis sensor with probability $q_2$. The action of the decision maker is to place a weapon (bomb) in one of the four regions.


From the above, it is clear that there are four states of the world:

$X^1$ = Ship is in 00 region,
$X^2$ = Ship is in 01 region,
$X^3$ = Ship is in 10 region,
$X^4$ = Ship is in 11 region,

four messages:

$Y^1$ = Ship is reported in 00 region,
$Y^2$ = Ship is reported in 01 region,
$Y^3$ = Ship is reported in 10 region,
$Y^4$ = Ship is reported in 11 region,

and four acts:

$A^1$ = Place bomb in region 00.
$A^2$ = Place bomb in region 01.
$A^3$ = Place bomb in region 10.
$A^4$ = Place bomb in region 11.


The utility structure which we will assume models the destructive capabilities of the weapon. It specifies a return of two units of utility if the bomb is placed in the same region where the ship is located. If the bomb is placed in a region adjacent to where the ship is located, the utility is only one-half the previous value. Finally, if the bomb is placed in a non-adjacent region, the return is null. The utility matrix is then:

$$U = \begin{array}{c|cccc} & X^1 & X^2 & X^3 & X^4 \\ \hline A^1 & 2 & 1 & 1 & 0 \\ A^2 & 1 & 2 & 0 & 1 \\ A^3 & 1 & 0 & 2 & 1 \\ A^4 & 0 & 1 & 1 & 2 \end{array}$$
The information system used by the decision maker is modeled by the matrix $Q = \{P(Y^j \mid X^i)\}$, with rows $X^1, \ldots, X^4$ and columns $Y^1, \ldots, Y^4$.

[Matrix $Q$: entries not recoverable from the source.]

The  above  Q  model  reflects  the  assumption  that  the  two  axis  sensors  are 
mutually  statistically  independent. 
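
One way to build a sensor model with this independence property is sketched below. The entries of the report's own $Q$ matrix are not available here, so this is an assumption-laden reconstruction rather than the original: each axis sensor is taken to report the correct interval with its stated probability and the other interval otherwise, and the region label is read as (z digit, w digit), so that region 01 has the w coordinate in the second interval. The function names and the values 0.9 and 0.8 are hypothetical.

```python
import numpy as np

def axis_sensor(q):
    """2x2 model of one axis sensor: row = true interval, column = reported interval."""
    return np.array([[q, 1.0 - q], [1.0 - q, q]])

def independent_sensor_model(q1, q2):
    """Q[i, j] = P(Y^j | X^i) for two independent axis sensors.

    Assumption: states and messages are ordered 00, 01, 10, 11 with the label
    read as (z digit, w digit), so the index is 2 * z + w."""
    return np.kron(axis_sensor(q2), axis_sensor(q1))   # z-axis factor first, w-axis second

Q = independent_sensor_model(q1=0.9, q2=0.8)
```

Under these assumptions the diagonal entries equal $q_1 q_2$, and the probability of reporting the diagonally opposite region is $(1-q_1)(1-q_2)$.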

If  we  now  characterize  the  collection  of  alternate  prior  distributions 
that  the  decision  maker  is  willing  to  consider  and  a  distribution  over  them, 
we  will  have  all  the  inputs  for  both  decision-making  models. 

To take the simplest case, we can assume that the decision maker characterizes the environment weakly, i.e., ignoring any statistical dependency which may exist between the two coordinates of a ship location. In such a case, the prior distributions that the decision maker may entertain are of the form:

$$\begin{aligned} p(X^1) &= (1-p_1)(1-p_2) \\ p(X^2) &= (1-p_2)\, p_1 \\ p(X^3) &= (1-p_1)\, p_2 \\ p(X^4) &= p_1\, p_2 \end{aligned}$$

where $p_1$ and $p_2$ are respectively the probabilities that the ship's w coordinate and z coordinate fall in the second interval of the corresponding axis.

We will allow the numbers $p_1$ and $p_2$ to assume four equidistant values, obtaining 16 different priors. The 16 priors will be assumed at the outset to be equally likely. Thus we have defined the matrix $\Pi$ and the vector $G(\pi^m)$, which were the only two missing inputs for our models.
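
The construction of the 16 candidate priors can be sketched as follows. The report does not state which four equidistant values are used, so the grid 0.2, 0.4, 0.6, 0.8 below is purely hypothetical, as are the function and variable names; the state ordering assumes that $X^2$ (region 01) has the w coordinate in the second interval.

```python
import numpy as np
from itertools import product

def product_priors(values):
    """One prior per (p1, p2) pair, over the states X^1..X^4 (regions 00, 01, 10, 11)."""
    rows = [[(1 - p1) * (1 - p2), p1 * (1 - p2), (1 - p1) * p2, p1 * p2]
            for p1, p2 in product(values, repeat=2)]
    return np.array(rows)

values = np.linspace(0.2, 0.8, 4)        # hypothetical equidistant grid
Pi = product_priors(values)              # matrix whose rows are the candidate priors
G = np.full(len(Pi), 1.0 / len(Pi))      # equally likely at the outset
```

Each row of `Pi` sums to one, and `G` can be passed through the Bayesian learning step described for the averaging model as observations arrive.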
It should be pointed out that the structure of the computer program described in Section 2.2.3 allows extremely flexible redefinition of the decision task. Thus, one could easily modify the sensor model to represent, for example, two isotropic sonar buoys placed in the regions 00 and 11 by simply utilizing the following $Q$ matrix:

[Matrix $Q$ for the two-buoy sensor model: entries, which are expressions in $q$ and $r$, are not recoverable from the source.]


Here  the  sensors  are  assumed  to  have  three  distinct  readouts:  same 
region  as  that  of  buoy;  one  of  the  two  regions  adjacent  to  buoy;  region 
furthest  from  buoy.  Also  we  are  assuming  that  a  misreading  between 
adjacent  readouts  can  occur  with  probability  q  and  between  non-adjacent 
readouts  with  probability  r;  furthermore,  multiple  misreadings  are 
assumed  to  occur  in  a  mutually  independent  manner. 
SECTION III


A  MODEL  OF  MULTI-PERIOD  DECISION  MAKING 


We will begin this section by considering multi-period decision making in stationary environments. These are environments described by probability distributions which are time invariant. Let $X^m$ be a state of the world, $A^i$ an act, $\theta^j$ an outcome. Then the environment can be described by the three-dimensional array of probability numbers

$$\{p(\theta^j \mid A^i, X^m)\}.$$

The problem we want to solve is the following: given a prior distribution at some time $t$ over the states of the world, $\{p_t(X^m)\}$, and a set of utility numbers $\{u_{ij} = u(A^i, \theta^j)\}$, which indicate the utility which accrues to the decision maker if his act $A^i$ is followed by the outcome $\theta^j$, determine the optimal act for the present time which will lead to maximizing present and future payoffs.

The approach followed is to reduce the sequential decision-making problem to a static one by computing an "equivalent" utility matrix $U' = U + V$, where $U = \{u_{ij}\}$ and $V = \{v_{ij}\}$ is the maximum expected utility return over the remaining decision points which can accrue once the initial act $A^i$ is followed by the outcome $\theta^j$. In other words, $v_{ij}$ is the maximum expected future return which could ensue if at the initial instant the pair $(A^i, \theta^j)$ obtained.

Once the equivalent utility matrix has been obtained, one can select the optimal act as follows: The expected value of the $i$-th act is

$$v(A^i) = \sum_j p_t(\theta^j \mid A^i)\, u'_{ij}$$

where $p_t(\theta^j \mid A^i)$ is given by

$$p_t(\theta^j \mid A^i) = \sum_k p(\theta^j \mid A^i, X^k)\, p_t(X^k). \tag{1'}$$

Then one selects the act which maximizes such value, i.e., the $A^{i_0}$ such that

$$\sum_j p_t(\theta^j \mid A^{i_0})\, u'_{i_0 j} = \max_i \sum_j p_t(\theta^j \mid A^i)\, u'_{ij}. \tag{1}$$

This would solve our problem if the quantities $\{u'_{ij}\}$ were available.

Let us illustrate, then, how to compute $\{u'_{ij}\}$. For simplicity, we will consider a case in which only two periods need be considered. In applying this formulation to a specific problem, one has to determine the value $n$ such that the $\{u'_{ij}\}$ computed over $n$ steps, as well as over $n+1$ steps, are "sufficiently" equal. Such $n$ is the minimum amount of future which needs to be considered for the specific sequential problem at hand.

If two periods need to be considered, then since $u'_{ij} = u_{ij} + v_{ij}$, $v_{ij}$ represents the maximum expected return in the second step given that in the first step $(A^i, \theta^j)$ obtained. But such a quantity is simply

$$v_{ij} = \max_l \sum_k p^{(i,j)}_{t+1}(\theta^k \mid A^l)\, u_{lk} \tag{2}$$

where the quantity $p^{(i,j)}_{t+1}(\theta^k \mid A^l)$ is the probability of $\theta^k$ (at $t+1$) given $A^l$ (at $t+1$), given that at $t$, $(A^i, \theta^j)$ occurred. This probability is computed as follows:

$$p^{(i,j)}_{t+1}(\theta^k \mid A^l) = \sum_m p(\theta^k \mid A^l, X^m)\, p^{(i,j)}_{t+1}(X^m) \tag{3}$$

where $p^{(i,j)}_{t+1}(X^m)$ is the posterior distribution of the states after the occurrence of the $(A^i, \theta^j)$ pair and $p(\theta^k \mid A^l, X^m)$ is a member of the original three-dimensional array which describes the environment. The distribution $p^{(i,j)}_{t+1}(X^m)$ is in turn given by:

$$p^{(i,j)}_{t+1}(X^m) = p_t(X^m \mid \theta^j, A^i) \tag{4'}$$

where

$$p_t(X^m \mid \theta^j, A^i) = \frac{p(\theta^j \mid A^i, X^m)\, p_t(X^m)}{\sum_l p(\theta^j \mid A^i, X^l)\, p_t(X^l)}. \tag{4}$$


Thus the complete solution of the two-period sequential decision problem in a stationary environment proceeds as follows: Using (4) and (4') (Bayes Theorem), one obtains the posterior distribution over the states after observing the outcome of the first act (thus this model accounts for the learning performed by the decision maker upon acting and observing the consequences (outcome) of his act).

Next, using (3) and (2), one is in a position to compute the "optimal future return" $v_{ij}$ of the pair $(A^i, \theta^j)$. Finally, using (1) and (1'), one can select the optimal act $A^{i_0}$ for now. This process can be repeated as soon as the outcome is observed by simply replacing $p_t(X^k)$ with the $p^{(i,j)}_{t+1}(X^k)$, where $i$ and $j$ correspond to the act that was chosen and the outcome which actually occurred. Thus we obtain the selection of the optimal act for each decision point in time, each selection taking into account only the returns to be expected at the decision point and the one just beyond it.

Extensions of this schema to futures of more than two adjacent decision points are conceptually straightforward, but computationally cumbersome.
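
The two-period procedure for a stationary environment can be sketched numerically. This is a minimal illustration under assumptions, not the report's program: the array layout `P[j, i, m]`, the function name, and the example values are hypothetical, indices are 0-based, and pairs $(A^i, \theta^j)$ that cannot occur are given a future return of zero.

```python
import numpy as np

def two_period_act(P, u, prior):
    """P[j, i, m] = p(theta^j | A^i, X^m); u[i, j] = u(A^i, theta^j);
    prior[m] = p_t(X^m). Returns (index of the selected act, expected values)."""
    n_out, n_act, _ = P.shape
    v = np.zeros((n_act, n_out))
    for i in range(n_act):
        for j in range(n_out):
            joint = P[j, i, :] * prior                 # numerator of (4)
            if joint.sum() == 0.0:
                continue                               # the pair (A^i, theta^j) cannot occur
            post = joint / joint.sum()                 # (4) and (4')
            pred = P @ post                            # (3): pred[k, l] = p(theta^k | A^l)
            v[i, j] = (pred.T * u).sum(axis=1).max()   # best second-step expected return
    u_eq = u + v                                       # equivalent utility matrix U' = U + V
    now = P @ prior                                    # (1'): now[j, i] = p_t(theta^j | A^i)
    expected = (now.T * u_eq).sum(axis=1)              # expected value of each act
    return int(expected.argmax()), expected            # (1)

# Hypothetical example: act i yields outcome 0 (worth one unit) exactly when the state is i.
hit = np.eye(2)
P = np.stack([hit, 1.0 - hit])          # P[0] = "hit" probabilities, P[1] = "miss"
u = np.array([[1.0, 0.0], [1.0, 0.0]])
best_act, expected = two_period_act(P, u, np.array([0.7, 0.3]))
```

In this example either outcome reveals the state, so every pair has a future return of one and the first-step choice is driven by the prior alone.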

3.1 Markovian Environments


If the multi-period decision making takes place in a non-stationary environment, it is no longer possible to describe the environment with the probability array $\{p(\theta^j \mid A^i, X^m)\}$ since this array is independent of time. Let us consider the special class of Markovian environments. An environment is said to be Markovian if the probability array which controls the outcomes depends only on the act and outcome which took place in the previous instant. It then follows that a Markovian environment is described by the array

$$\{p(\theta^h_t \mid \theta^k_{t-1}, A^i_{t-1}, A^j_t, X^m_t)\}$$

whose individual entry represents the probability that, given that the present (time $t$) act and state are respectively $A^j$ and $X^m$ and given that the previous instant (time $t-1$) was characterized by act $A^i$ and outcome $\theta^k$, the outcome $\theta^h$ will occur at $t$ (now).

If one realizes that the only difference which is introduced by the environment being Markovian is that one can no longer talk about a fixed model of the environment $\{p(\theta^h \mid A^j, X^m)\}$ but has to model the environment instant by instant with an array

$$\{p^{(k,i)}_t(\theta^h \mid A^j, X^m)\}$$

where

$$p^{(k,i)}_t(\theta^h_t \mid A^j_t, X^m) = p(\theta^h_t \mid \theta^{(k)}_{t-1}, A^{(i)}_{t-1}, A^j_t, X^m_t)$$

and the brackets around $k$ and $i$ indicate that these indices are fixed in the array, then it is extremely easy to generalize the procedure given for stationary environments to the case of Markovian environments. The generalized procedure is given briefly below:


To solve completely the two-period decision problem in a Markovian environment, one proceeds as follows. Using

$$p^{(j,h)\mid(k,i)}_{t+1}(X^m) = p_t(X^m \mid \theta^h_t, A^j_t, \theta^k_{t-1}, A^i_{t-1}) \tag{4b'}$$

and

$$p_t(X^m \mid \theta^h_t, A^j_t, \theta^k_{t-1}, A^i_{t-1}) = \frac{p(\theta^h_t \mid \theta^k_{t-1}, A^i_{t-1}, A^j_t, X^m_t)\, p_t(X^m)}{\sum_l p(\theta^h_t \mid \theta^k_{t-1}, A^i_{t-1}, A^j_t, X^l_t)\, p_t(X^l)} \tag{4b}$$

one obtains the posterior distribution over the states.


Equation (4b), which is Bayes Theorem, models the learning process of the decision maker who, observing the datum $\theta^h_t$ under the conditions $(\theta^k_{t-1}, A^i_{t-1}, A^j_t)$, learns how to better discriminate the hypotheses $\{X^m\}$.

The optimal future return for the pair $(A^j, \theta^h)$ is computed as follows:
First obtain

$$p(\theta^q_{t+1} \mid \theta^h_t, A^j_t, A^l_{t+1}) = \sum_m p(\theta^q_{t+1} \mid \theta^h_t, A^j_t, A^l_{t+1}, X^m_{t+1})\, p^{(j,h)\mid(k,i)}_{t+1}(X^m). \tag{3b}$$

Notice that in (3b) one is using the array

$$\{p(\theta^q_{t+1} \mid \theta^{(h)}_t, A^{(j)}_t, A^l_{t+1}, X^m_{t+1})\}$$

and not the array

$$\{p(\theta^h_t \mid \theta^{(k)}_{t-1}, A^{(i)}_{t-1}, A^j_t, X^m_t)\}.$$

This of course is due to the time-variant character of the environment. Once (3b) is used, one can compute $v_{jh}$ with:

$$v_{jh} = \max_l \sum_q p(\theta^q_{t+1} \mid \theta^h_t, A^j_t, A^l_{t+1})\, u(A^l, \theta^q). \tag{2b}$$


Finally  using 


ujh  ujh +  vjh 


and 


*x®>f-rALi,4)=Z*x4">tr4-i-4*r>Pt‘*,n> 


m 


(2b) 


db") 


(lb1) 


and 


z4et  let-r At-r At°)uj  h  =  maxZp(0 

u  °  j  , 

h  n 

Jo 

we  can  select  the  optimal  act  A  for  the  time  t  which  is  optimal  when 
its  expected  consequences  are  considered  over  only  two  future  decision 
points.  Again  extensions  to  more  than  two  future  decision  points  are 
quite  straightforward  but  become  very  cumbersome  computationally. 


h|ek  ,  A1  .Aj)u' 
t  1  t-1  t-1  V  jh 


(lb) 
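
The two-period procedure can be sketched numerically as follows. This is a minimal illustration, not the report's program: the function name, the nested-list layout, and the example numbers are assumptions, and the comments map each step to the equation labels (4b), (3b), (2b), (1b''), (1b') and (1b) as we read them. The previous state and act are taken as fixed, so they do not appear as indices.

```python
def best_first_act(prior, like, next_like, u):
    """Two-period lookahead for a fixed previous state and act.

    prior[m]                 -- P_t(X^m)
    like[m][h][j]            -- probability of datum h at t under act j, hypothesis m
    next_like[m][h][j][q][r] -- probability of datum r at t+1 given datum h and
                                act j at t, act q at t+1, hypothesis m
    u[j][h]                  -- utility of act j when datum h occurs (assumed layout)
    """
    n_hyp, n_act, n_out = len(prior), len(u), len(u[0])
    best_j, best_value = None, float("-inf")
    for j in range(n_act):
        value = 0.0
        for h in range(n_out):
            joint = [like[m][h][j] * prior[m] for m in range(n_hyp)]
            marginal = sum(joint)                      # (1b')
            if marginal == 0.0:
                continue                               # datum h cannot occur under act j
            post = [x / marginal for x in joint]       # (4b), (4b')
            v = max(                                   # (2b)
                sum(
                    sum(next_like[m][h][j][q][r] * post[m] for m in range(n_hyp))  # (3b)
                    * u[q][r]
                    for r in range(n_out)
                )
                for q in range(n_act)
            )
            value += marginal * (u[j][h] + v)          # (1b'') inside the sum of (1b)
        if value > best_value:
            best_j, best_value = j, value
    return best_j, best_value


# Hypothetical example: two hypotheses, two data values, two acts.
prior = [0.5, 0.5]
like = [[[0.9, 0.9], [0.1, 0.1]],      # hypothesis 0: datum 0 is likely
        [[0.2, 0.2], [0.8, 0.8]]]      # hypothesis 1: datum 1 is likely
# For simplicity the next-period array ignores (h, j); a Markovian
# environment would let it depend on them.
next_like = [[[[[like[m][r][q] for r in range(2)] for q in range(2)]
               for j in range(2)] for h in range(2)] for m in range(2)]
u = [[1.0, 0.0], [0.0, 1.0]]           # act j pays 1 when datum j occurs
act, value = best_first_act(prior, like, next_like, u)
```

In this example both acts have the same expected future return (0.75), so the choice is decided by the immediate return, and the first act is selected with a total of 1.3.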
SECTION IV


PROBLEM  AREAS  UNCOVERED  BY  THE  RESEARCH 
AND  POTENTIAL  EXTENSIONS 


The  decision-making  model  presented  in  the  first  part  of  this 
report  is  concerned  with  extending  decision  theory  to  the  case  where 
the  decision  maker  does  not  know  the  statistical  properties  of  the 
stationary  environment  in  which  he  is  assumed  to  operate.  The  model 
replaces the prior distribution with an ensemble of hypothetical
distributions, i.e., with a space of prior distributions and an associated
density  function. 

This  density  function  represents  the  state  of  knowledge  of  the 
decision  maker  concerning  the  statistical  properties  of  his  environment. 
Thus a uniform density function would reflect a state of no information,
and a Dirac delta density function would correspond to the state of perfect
statistical information. Traditional decision theory is concerned only
with  the  above  two  extreme  cases.  This  information  state  is,  of  course, 
subject  to  modification  under  the  impact  of  empirical  evidence  supplied 
by  the  observation  system.  The  model  incorporates  empirical  evidence 
by  a  suitable  Bayesian  learning  submodel.  The  learning  model  used  is 
appropriate  for  a  stationary  environment. 
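
The Bayesian learning submodel for a stationary environment can be sketched as below. The sketch follows the arithmetic of the UPDATE subroutine listed in the appendix as we read it (update the density over candidate priors after one message, then form the average prior); the Python names, the list layout, and the numbers are assumptions made for illustration.

```python
def update_density_over_priors(g, priors, channel, message):
    """One Bayes step on the density g over candidate prior distributions.

    g[d]          -- current weight of candidate prior d
    priors[d][n]  -- probability of state n under candidate prior d
    channel[n][m] -- probability of message m when the state is n
    message       -- index of the message just received
    """
    # Probability of the message under each candidate prior.
    likelihood = [
        sum(p[n] * channel[n][message] for n in range(len(p))) for p in priors
    ]
    denom = sum(w * l for w, l in zip(g, likelihood))
    if denom == 0.0:
        raise ValueError("message has zero probability under every candidate prior")
    new_g = [w * l / denom for w, l in zip(g, likelihood)]
    # Average prior: the candidates mixed with their updated weights.
    average_prior = [
        sum(new_g[d] * priors[d][n] for d in range(len(priors)))
        for n in range(len(priors[0]))
    ]
    return new_g, average_prior


# Hypothetical example: two candidate priors over two states, uniform density.
priors = [[0.9, 0.1], [0.1, 0.9]]
channel = [[0.8, 0.2], [0.2, 0.8]]
new_g, average_prior = update_density_over_priors([0.5, 0.5], priors, channel, 0)
```

Starting from the uniform (no information) density, one observation of message 0 moves the weights to about 0.74 and 0.26, and the average prior to about (0.692, 0.308).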

A  natural  extension  of  the  theory  is  the  incorporation  of  learning 
models for time-variant environments. In the following paragraphs, we
briefly describe how such an extension could be obtained in
the case of the piecewise-stationary environments, which are used in
the experiments reported by A. Rapoport in Reference 2. An environment
is said to be piecewise-stationary if its statistical properties
change only at discrete instants in time; such instants will be
referred to as time markers.

Let us assume the simplest piecewise-stationary environment,
namely, an environment with a single time marker. The ensemble of
hypotheses for such an environment is the collection of vectors $\{\pi_b, t, \pi_a\}$
and a suitable density function over them. The symbol $\pi_b$ indicates the
prior distribution which holds before the marker, while $\pi_a$ is the prior
distribution which applies after the marker, and t is the value of the time
marker.

The  empirical  information  will  update  the  density  function  on  the 
hypothesis  space  via  the  mechanism  of  Bayes  rule.  The  selection  of  the 
appropriate hypothesis has to be done by solving a decision-theoretic
schema which incorporates the differential cost structure for pairs of
hypotheses. It is through this kind of mechanism that a more rapid
discounting of past observations should result in time-variant
environments. Mechanisms which rely on modifications of Bayes' theorem
suggested on intuitive grounds are not acceptable because there is no
guarantee that they will result in an internally consistent formal system.
We recommend that the extension of the theory sketched above be
developed in order to derive a normative theory for the experimental
situations presented in Reference 2.
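
The Bayes-rule part of this proposed extension can be sketched for a finite grid of hypotheses. Everything below is an assumption made for illustration and is not part of the reported work: the states are taken to be observed directly, each hypothesis is a triple (prior before the marker, marker position, prior after the marker), and the names and numbers are hypothetical. The sketch covers only the updating of the density; the selection of a hypothesis through the decision-theoretic schema with its cost structure is not shown.

```python
def update_marker_weights(weights, hypotheses, step, observed_state):
    """One Bayes step on the density over single-marker hypotheses.

    weights[i]     -- current weight of hypotheses[i]
    hypotheses[i]  -- (prior_before, marker, prior_after); the prior in force
                      at time `step` is prior_before when step < marker
    observed_state -- index of the state observed at time `step` (assumed
                      to be observed directly)
    """
    joint = []
    for w, (prior_before, marker, prior_after) in zip(weights, hypotheses):
        active = prior_before if step < marker else prior_after
        joint.append(w * active[observed_state])
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return [x / total for x in joint]


low, high = [0.9, 0.1], [0.1, 0.9]
# Two hypotheses that differ only in where the marker falls.
hypotheses = [(low, 2, high), (low, 4, high)]
weights = [0.5, 0.5]
for step, state in enumerate([0, 0, 1, 1]):
    weights = update_marker_weights(weights, hypotheses, step, state)
```

After the switch in the observed states at step 2, nearly all of the weight moves to the hypothesis whose marker is at 2. Past observations are thereby discounted through the hypothesis space itself rather than through an ad hoc change to Bayes' theorem.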

The  multi-period  decision-making  model  discussed  in  Section  III 
can  also  be  formulated  for  time-variant  environments  with  a  finite 
history.  For  decision  tasks  with  an  invariant  utility  structure  and  a 
well-informed  decision  maker,  the  multi-period  model  is  the  most 
general  decision-making  model.  Consequently,  further  work  on  this 
model  should  not  be  concerned  with  generalizing  it,  but  rather  with  the 
discovery  of  specialized  cases  which  yield  powerful  algorithms  for  the 
exercise  of  the  model. 
APPENDIX


This appendix contains listings of the executive routine and the
subroutines for the computer program described in the flow chart
presented as Figure 1 in the report.
C     DECISION MODELS 1,2    16 DEC 1964
C     PARTIALLY INFORMED DECISION MAKING MODELS 1,2 EXECUTIVE PROGRAM
C     THIS PROGRAM ALLOWS THE COMPUTATION OF THE OPTIMUM STRATEGY
C     FOR DECISION MAKING IN A STATIONARY ENVIRONMENT WITH STATIONARY
C     AND INDEPENDENT DATA SOURCES AND NOT WELL INFORMED DECISION MAKER
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
   99 FORMAT (5I4)
      READ 99,N,NA,ND,NM,NO
   98 FORMAT (8F6.3)
      READ 98,(G(IG),IG=1,8)
      READ 98,(G(IG),IG=9,16)
      READ 98,((U(IU,JU),IU=1,4),JU=1,2)
      READ 98,((U(IU,JU),IU=1,4),JU=3,4)
C97   FORMAT(NO I2)
   97 FORMAT (20I2)
      READ 97,NY
      CALL QSUB
      CALL PISUB
      DO 100 IPI=1,ND
      DO 100 JQ=1,NM
      OME(IPI,JQ)=0
      DO 100 KQ=1,N
  100 OME(IPI,JQ)=OME(IPI,JQ)+PI(IPI,KQ)*Q(KQ,JQ)
      IF (SENSE SWITCH 1)311,312
  311 CALL CSUB
  312 DO 104 I=1,NO-1
      CALL UPDATE
      IF (SENSE SWITCH 1) 110,111
  110 CALL PRIOR
      GO TO 113
  111 DO 112 IPR=1,N
  112 PR(IPR)=PIAV(IPR)
  113 IF (SENSE SWITCH 2) 102,103
C101  FORMAT($SELECTED PRIOR$/NF6.3)
  101 FORMAT($ SELECTED PRIOR$/4F6.3)
  102 TYPE 101,PR
  103 CALL DECIS
  104 CONTINUE
C202  FORMAT($PROGRAM INPUTS$///$N=$,I2,3X,$NA=$,I2,3X,$ND=$,I2,$ NM=$,
C    1I2,3X,$NO=$,I2//$Q MATRIX$//NM(N F6.3)//$PI MATRIX$/
C    1N(ND/2 F6.3/6X,ND/2 F6.3//)//$ORIGINAL G DISTRIBUTION$/
C    1ND/2F6.3/6X,ND/2F6.3//$LIST OF OBSERVATIONS $/NOI2/)
  202 FORMAT($PROGRAM INPUTS$///$N=$,I2,3X,$NA=$,I2,3X,$ND=$,I2,$ NM=$,
     1I2,3X,$NO=$,I2//$Q MATRIX$//4(4F6.3/)//$PI MATRIX $/
     14(8F6.3/6X,8F6.3//)//$ORIGINAL G DISTRIBUTION$/
     18F6.3/6X,8F6.3//$LIST OF OBSERVATIONS $/20I2/)
      TYPE 202,N,NA,ND,NM,NO,Q,PI,G,NY
C313  FORMAT(/$UTILITY MATRIX$/N(NA F6.3/))
  313 FORMAT(/$UTILITY MATRIX$/4(4F6.3/))
      TYPE 313,U
C203  FORMAT($DECISION MAKER STRATEGY$///$LAST OBSERVATION$,6X,NO-1I2/
C    1$PRESENT OBSERVATION$,3X,NO-1 I2/$SELECTED ACT$,10X,NO-1 I2)
  203 FORMAT($DECISION MAKER STRATEGY$///$LAST OBSERVATION$,6X,19I2/
     1$PRESENT OBSERVATION$,3X,19I2/$SELECTED ACT$,10X,19I2)
      TYPE 203,(NY(I),I=1,NO-1),(NY(I),I=2,NO),(KACT(I),I=1,NO-1)
      END


      SUBROUTINE QSUB
C     THIS IS THE Q MODEL OF EXAMPLE 1
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
  150 FORMAT (2F6.3)
      READ 150,Q1,Q2
      Q(1,1)=Q1*Q2
      Q(1,2)=Q2*(1.0-Q1)
      Q(1,3)=Q1*(1.0-Q2)
      Q(1,4)=(1.0-Q1)*(1.0-Q2)
      Q(2,1)=Q2*(1.0-Q1)
      Q(2,2)=Q1*Q2
      Q(2,3)=(1.0-Q1)*(1.0-Q2)
      Q(2,4)=Q1*(1.0-Q2)
      Q(3,1)=Q1*(1.0-Q2)
      Q(3,2)=(1.0-Q1)*(1.0-Q2)
      Q(3,3)=Q1*Q2
      Q(3,4)=Q2*(1.0-Q1)
      Q(4,1)=(1.0-Q1)*(1.0-Q2)
      Q(4,2)=Q1*(1.0-Q2)
      Q(4,3)=Q2*(1.0-Q1)
      Q(4,4)=Q1*Q2
C     QSUB INPUTS
  201 FORMAT (// $ Q1 AND Q2 VALUES $//F6.3,3X,F6.3)
  207 TYPE 201,Q1,Q2
      RETURN
      END
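
The sixteen assignments in QSUB appear to follow one pattern: each state and each message can be read as a pair of binary components, with the first component reported correctly with probability Q1 and the second with probability Q2, independently. The Python sketch below rebuilds the matrix from that reading. The interpretation, the function name, and the 0-based indexing are our assumptions, not statements from the report.

```python
def q_matrix(q1, q2):
    """Rebuild the 4 x 4 matrix Q[n][m] from two independent binary sources.

    State n and message m (0-based) are split into a first component
    (n % 2) and a second component (n // 2). A component is reported
    correctly with probability q1 (first) or q2 (second).
    """
    def component_prob(q, same):
        return q if same else 1.0 - q

    return [
        [
            component_prob(q1, n % 2 == m % 2) * component_prob(q2, n // 2 == m // 2)
            for m in range(4)
        ]
        for n in range(4)
    ]


Q = q_matrix(0.8, 0.7)
# Q[0][1] corresponds to Q(1,2) = Q2*(1.0-Q1) in the listing.
row_sums = [sum(row) for row in Q]
```

Each row sums to one, as a conditional distribution over messages must, and the entries agree with the listing, for example Q(1,2) = Q2*(1.0-Q1) and Q(4,1) = (1.0-Q1)*(1.0-Q2).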


      SUBROUTINE PISUB
C     DISTRIBUTION THAT RESULTS IF TWO INDEPENDENT PROB AXIS ARE
C     ASSUMED TO ACQUIRE 4 EQUAL PROBABILITIES
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
C     DIMENSION P1(NPV),P2(NPV)
      DIMENSION P1(4),P2(4)
  128 FORMAT (8F6.3,I3)
      READ 128,P1,P2,NPV
      ND=NPV*NPV
      DO 130 IPI=1,ND
      DO 130 JPI=1,N
  130 PI(IPI,JPI)=0
      DO 131 IP2=1,NPV
      DO 131 IP1=1,NPV
      IPI=IP1+NPV*(IP2-1)
      PI(IPI,1)=(1.0-P1(IP1))*(1.0-P2(IP2))
      PI(IPI,2)=P1(IP1)*(1.0-P2(IP2))
      PI(IPI,3)=(1.0-P1(IP1))*P2(IP2)
  131 PI(IPI,4)=P1(IP1)*P2(IP2)
C     PISUB INPUTS
C200  FORMAT(//$P1 AND P2 VALUES $//NPV F6.3/NPV F6.3)
  200 FORMAT (// $ P1 AND P2 VALUES $//4F6.3/4F6.3)
      TYPE 200,P1,P2
      RETURN
      END


      SUBROUTINE CSUB
C     THIS ROUTINE COMPUTES THE LOSS RESULTING FROM HYPOTHESIZING PI
C     RATHER THAN PJ AS THE PRIOR
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
C     DIMENSION DPI(N,N),DPJ(N,N),C(NA,N),CJ(NA,NM),CI(NA,NM),EI(NM),EJ(NM)
      DIMENSION DPI(4,4),DPJ(4,4),C(4,4),CJ(4,4),CI(4,4),EI(4),EJ(4)
      DO 11 IPL=1,ND
      DO 11 JPL=1,ND
      DO 12 L=1,N
      DO 12 K=1,N
      DPI(L,K)=0
      DPI(L,L)=PI(IPL,L)
      DPJ(L,K)=0
   12 DPJ(L,L)=PI(JPL,L)
      DO 1 M=1,NA
      DO 1 K=1,N
      C(M,K)=0
      DO 1 L=1,N
    1 C(M,K)=C(M,K)+U(M,L)*DPJ(L,K)
      DO 2 M=1,NA
      DO 2 K=1,NM
      CJ(M,K)=0
      DO 2 I1=1,N
    2 CJ(M,K)=CJ(M,K)+C(M,I1)*Q(I1,K)
      VJ=0
      DO 5 K=1,NM
      EJ(K)=CJ(1,K)
      DO 4 L=2,NA
      IF(EJ(K)-CJ(L,K)) 3,3,4
    3 EJ(K)=CJ(L,K)
    4 CONTINUE
    5 VJ=VJ+EJ(K)
      DO 6 M=1,NA
      DO 6 K=1,N
      C(M,K)=0
      DO 6 L=1,N
    6 C(M,K)=C(M,K)+U(M,L)*DPI(L,K)
      DO 7 M=1,NA
      DO 7 K=1,NM
      CI(M,K)=0
      DO 7 I1=1,N
    7 CI(M,K)=CI(M,K)+C(M,I1)*Q(I1,K)
      VI=0
      DO 10 K=1,NM
      EI(K)=CI(1,K)
      EJ(K)=CJ(1,K)
      DO 9 L=2,NA
      IF(EI(K)-CI(L,K))8,8,9
    8 EI(K)=CI(L,K)
      EJ(K)=CJ(L,K)
    9 CONTINUE
   10 VI=VI+EJ(K)
      IF(SENSE SWITCH 4)301,11
  301 TYPE 300,DPI,DPJ,C,CJ,VJ,CI,VI
  300 FORMAT($ DPI$/4(4F6.3/)//$DPJ$/4(4F6.3/)//$C$/4(4F6.3/)//$CJ$/
     14(4F6.3/)//$VJ=$,F6.3//$CI$/4(4F6.3/)//$VI=$,F6.3)
   11 PL(IPL,JPL)=VI-VJ
      IF(SENSE SWITCH 3)303,304
C302  FORMAT($COST OF MISINFERENCE$///ND(ND/2F6.3/6X,ND/2F6.3//))
  302 FORMAT($COST OF MISINFERENCE$///16(8F6.3/6X,8F6.3//))
  303 TYPE 302,PL
  304 RETURN
      END
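
The quantity that CSUB stores in PL can be read as follows: the decision maker picks, for each message, the act that is best under the assumed prior, and the loss is the expected utility of those acts under the true prior minus the best attainable expected utility. The Python sketch below restates that computation for one pair of priors. The function names, the list layout, and the numbers are assumptions made for illustration.

```python
def expected_utility_table(utility, prior, channel):
    """C[a][k] = sum over states n of U[a][n] * prior[n] * Q[n][k]."""
    return [
        [
            sum(row[n] * prior[n] * channel[n][k] for n in range(len(prior)))
            for k in range(len(channel[0]))
        ]
        for row in utility
    ]


def misinference_loss(utility, assumed_prior, true_prior, channel):
    """VI - VJ in the notation of CSUB: never positive, zero when priors agree."""
    table_assumed = expected_utility_table(utility, assumed_prior, channel)
    table_true = expected_utility_table(utility, true_prior, channel)
    n_acts, n_msgs = len(utility), len(channel[0])
    loss = 0.0
    for k in range(n_msgs):
        # Act chosen under the assumed prior. Ties go to the first act here,
        # whereas the arithmetic IF in the listing keeps the last one.
        chosen = max(range(n_acts), key=lambda a: table_assumed[a][k])
        best_true = max(table_true[a][k] for a in range(n_acts))
        loss += table_true[chosen][k] - best_true
    return loss


utility = [[1.0, 0.0], [0.0, 1.0]]
channel = [[0.8, 0.2], [0.2, 0.8]]
loss_wrong = misinference_loss(utility, [0.9, 0.1], [0.1, 0.9], channel)
loss_right = misinference_loss(utility, [0.9, 0.1], [0.9, 0.1], channel)
```

In this example a decision maker who assumes the prior (0.9, 0.1) always selects the first act, which under the true prior (0.1, 0.9) yields 0.10 against an attainable 0.90, a loss of -0.80. Because the entries are non-positive, selecting the largest entry, as the PRIOR subroutine does, selects the hypothesis with the smallest expected loss.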
      SUBROUTINE UPDATE
C     THIS SUBROUTINE UPDATES THE DISTRIBUTION G(PI) TO ACCOUNT FOR
C     THE RECEIPT OF THE MESSAGE IDENTIFIED BY NY(I).
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
C     DIMENSION QJ(N),PIQJ(ND)
      DIMENSION QJ(4),PIQJ(16)
      JQ=NY(I)
      DO 50 IQ=1,N
   50 QJ(IQ)=Q(IQ,JQ)
      DO 51 IPI=1,ND
      PIQJ(IPI)=0
      DO 51 KPI=1,N
   51 PIQJ(IPI)=PIQJ(IPI)+PI(IPI,KPI)*QJ(KPI)
      DENO=0
      DO 52 JG=1,ND
   52 DENO=DENO+G(JG)*PIQJ(JG)
      DO 53 IPI=1,ND
   53 G(IPI)=(PIQJ(IPI)*G(IPI))/DENO
      DO 54 IDG=1,ND
      DO 54 JDG=1,ND
      DG(IDG,JDG)=0
   54 DG(IDG,IDG)=G(IDG)
      DO 55 JPI=1,N
      PIAV(JPI)=0
      DO 55 IPI=1,ND
   55 PIAV(JPI)=PIAV(JPI)+G(IPI)*PI(IPI,JPI)
      IF (SENSE SWITCH 3)306,307
C305  FORMAT($LAST OBSERVATION=$,I2//$G NEW DISTR$//2(ND/2F6.3/)/
C    1/$AVERAGE PRIOR$/N F6.3)
  305 FORMAT($LAST OBSERVATION=$,I2//$G NEW DISTR$//2(8F6.3/)/
     1/$AVERAGE PRIOR$/4F6.3)
  306 TYPE 305,JQ,G,PIAV
  307 RETURN
      END


      SUBROUTINE PRIOR
C     THIS SUBROUTINE SELECTS THAT COLUMN OF THE MATRIX PL*DG*OME
C     WHICH CORRESPONDS TO THE NEXT OBS MESSAGE AND SELECTS THE PRIOR
C     WHICH CORRESPONDS TO ITS LARGEST ENTRY
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
C     DIMENSION DL(ND,ND),DLOJ(ND)
      DIMENSION DL(16,16),DLOJ(16)
      DO 21 IL=1,ND
      DO 21 JD=1,ND
      DL(IL,JD)=0
      DO 21 KLD=1,ND
   21 DL(IL,JD)=DL(IL,JD)+PL(IL,KLD)*DG(KLD,JD)
      DO 22 IL=1,ND
      JOM=NY(I+1)
C     JOM SELECTS COLUMN OF OMEGA THUS OF L*DG*OMEGA. NY(I) LISTS
C     THE IDENTIFIERS OF THE MESSAGES MAKING UP A GIVEN EXPERIMENT.
C     NY(I+1) IS THUS THE IDENTIFIER OF THE PRESENT OBSERVATION.
      DLOJ(IL)=0
      DO 22 JD=1,ND
   22 DLOJ(IL)=DLOJ(IL)+DL(IL,JD)*OME(JD,JOM)
      AMAXIL=DLOJ(1)
      DO 25 IP=1,N
   25 PR(IP)=PI(1,IP)
      DO 24 IL=2,ND
      IF (AMAXIL-DLOJ(IL))23,23,24
   23 AMAXIL=DLOJ(IL)
      DO 26 IP=1,N
   26 PR(IP)=PI(IL,IP)
   24 CONTINUE
      IF(SENSE SWITCH 3)309,310
C308  FORMAT($PRESENT OBSERVATION=$,I2//$COST OF INF$/2(ND/2F6.3/))
  308 FORMAT($PRESENT OBSERVATION=$,I2//$COST OF INF$/2(8F6.3/))
  309 TYPE 308,JOM,DLOJ
  310 RETURN
      END


      SUBROUTINE DECIS
C     THIS SUBROUTINE COMPUTES THE DECISION RULE (UDQ)+ AND THE
C     OPTIMAL ACT FOR THE ACTUALLY OBSERVED Y. IT CAN OUTPUT DECIS RULE
      COMMON I,N,NA,ND,NM,NO,DG,G,PL,KACT,NY,OME,PIAV,PI,PR,Q,U
C     DIMENSION DG(ND,ND),G(ND),PL(ND,ND),KACT(NO-1),NY(NO),OME(ND,NM),
C    1PIAV(N),PI(ND,N),PR(N),Q(N,NM),U(NA,N)
      DIMENSION DG(16,16),G(16),PL(16,16),KACT(19),NY(20),OME(16,4),
     1PIAV(4),PI(16,4),PR(4),Q(4,4),U(4,4)
C     DIMENSION D(N,N),UD(NA,N),UDQ(NA,NM),AMAX(NM),KRULE(NM)
      DIMENSION D(4,4),UD(4,4),UDQ(4,4),AMAX(4),KRULE(4)
      DO 30 ID=1,N
      DO 30 JD=1,N
      D(ID,JD)=0
   30 D(ID,ID)=PR(ID)
      DO 31 IU=1,NA
      DO 31 JD=1,N
      UD(IU,JD)=0
      DO 31 KU=1,N
   31 UD(IU,JD)=UD(IU,JD)+U(IU,KU)*D(KU,JD)
      DO 32 IU=1,NA
      DO 32 JQ=1,NM
      UDQ(IU,JQ)=0
      DO 32 KD=1,N
   32 UDQ(IU,JQ)=UDQ(IU,JQ)+UD(IU,KD)*Q(KD,JQ)
      DO 34 JQ=1,NM
      KRULE(JQ)=1
      AMAX(JQ)=UDQ(1,JQ)
      DO 34 IU=2,NA
      IF(AMAX(JQ)-UDQ(IU,JQ))33,33,34
   33 AMAX(JQ)=UDQ(IU,JQ)
      KRULE(JQ)=IU
   34 CONTINUE
      IF(SENSE SWITCH 3)36,37
   35 FORMAT($COST OF ACT$/4(4F6.3/)//$KRULE$/4I2)
C35   FORMAT($COST OF ACT$/NM(NA F6.3/)//$KRULE$/NM I2)
   36 TYPE 35,UDQ,KRULE
   37 JQ=NY(I+1)
C     KACT(I) IS TAKEN OBSERVING NY(I+1)
      KACT(I)=KRULE(JQ)
      RETURN
      END